98 research outputs found

    Mapping and analysis of Caenorhabditis elegans transcription factor sequence specificities

    Caenorhabditis elegans is a powerful model for studying gene regulation, as it has a compact genome and a wealth of genomic tools. However, identification of regulatory elements has been limited, as DNA-binding motifs are known for only 71 of the estimated 763 sequence-specific transcription factors (TFs). To address this problem, we performed protein binding microarray experiments on representatives of canonical TF families in C. elegans, obtaining motifs for 129 TFs. Additionally, we predict motifs for many TFs that have DNA-binding domains similar to those already characterized, increasing coverage of binding specificities to 292 C. elegans TFs (~40%). These data highlight the diversification of binding motifs for the nuclear hormone receptor and C2H2 zinc finger families, and reveal unexpected diversity of motifs for T-box and DM families. Motif enrichment in promoters of functionally related genes is consistent with known biology, and also identifies putative regulatory roles for unstudied TFs.

    Probing scrambling using statistical correlations between randomized measurements

    We propose and analyze a protocol to study quantum information scrambling using statistical correlations between measurements, which are performed after evolving a quantum system from randomized initial states. We prove that the resulting correlations precisely capture the so-called out-of-time-ordered correlators (OTOCs) and can be used to probe chaos in strongly-interacting, many-body systems. Our protocol requires neither reversing time evolution nor auxiliary degrees of freedom, and can be realized in state-of-the-art quantum simulation experiments. Comment: This version (v2; 8 pages, 7 figures) includes important new results compared to our original submission. (1) We present a protocol, with a corresponding mathematical proof, to access OTOCs with local operations, which can be realized in quantum simulation experiments with available technology. (2) We illustrate the realization of the protocols with different examples for Hubbard and spin models.

    A novel parametric approach to mine gene regulatory relationship from microarray datasets

    Background: Microarrays have been widely used in the current decade to measure gene expression levels at the genome scale. Many algorithms have been developed to reconstruct gene regulatory networks from microarray data. Unfortunately, most of these models and algorithms focus on global properties of gene expression in regulatory networks, and few of them offer intuitive parameters. We asked whether some simple but basic characteristics of microarray datasets can be used to identify potential gene regulatory relationships. Results: Based on expression correlation, expression level variation and vectors derived from microarray expression levels, we first introduced several novel parameters to measure the characteristics of regulating gene pairs. Subsequently, we used a naïve Bayesian network to integrate these features together with the functional co-annotation between transcription factors and their target genes. Then, based on the time-delay characteristics of the expression profiles, we were able to predict both the existence and the direction of a regulatory relationship. Conclusions: Several novel parameters have been proposed and integrated to identify regulatory relationships. The integrated model proved more effective than any individual feature. We believe that our parametric approach can serve as a fast method for mining regulatory relationships.

    PhenoM: a database of morphological phenotypes caused by mutation of essential genes in Saccharomyces cerevisiae

    About one-fifth of the genes in the budding yeast are essential for haploid viability and cannot be functionally assessed using standard genetic approaches such as gene deletion. To facilitate genetic analysis of essential genes, we and others have assembled collections of yeast strains expressing temperature-sensitive (ts) alleles of essential genes. To explore the phenotypes caused by essential gene mutation we used a panel of genetically engineered fluorescent markers to explore the morphology of cells in the ts strain collection using high-throughput microscopy. Here, we describe the design and implementation of an online database, PhenoM (Phenomics of yeast Mutants), for storing, retrieving, visualizing and data mining the quantitative single-cell measurements extracted from micrographs of the ts mutant cells. PhenoM allows users to rapidly search and retrieve raw images and their quantified morphological data for genes of interest. The database also provides several data-mining tools, including a PhenoBlast module for phenotypic comparison between mutant strains and a Gene Ontology module for functional enrichment analysis of gene sets showing similar morphological alterations. The current PhenoM version 1.0 contains 78 194 morphological images and 1 909 914 cells covering six subcellular compartments or structures for 775 ts alleles spanning 491 essential genes. PhenoM is freely available at http://phenom.ccbr.utoronto.ca/

    Incorporating functional inter-relationships into protein function prediction algorithms

    Background: Functional classification schemes (e.g. the Gene Ontology) that serve as the basis for annotation efforts in several organisms are often the source of gold standard information for computational efforts at supervised protein function prediction. While successful function prediction algorithms have been developed, few previous efforts have utilized more than the protein-to-functional class label information provided by such knowledge bases. For instance, the Gene Ontology not only captures protein annotations to a set of functional classes, but it also arranges these classes in a DAG-based hierarchy that captures rich inter-relationships between different classes. These inter-relationships present both opportunities, such as the potential for additional training examples for small classes from larger related classes, and challenges, such as a harder-to-learn distinction between similar GO terms, for standard classification-based approaches. Results: We propose a method to enhance the performance of classification-based protein function prediction algorithms by exploiting the inter-relationships between the functional classes that make up functional classification schemes. Using a standard measure for evaluating the semantic similarity between nodes in an ontology, we quantify and incorporate these inter-relationships into the k-nearest neighbor classifier. We present experiments on several large genomic data sets, each of which is used for the modeling and prediction of over a hundred classes from the GO Biological Process ontology. The results show that this incorporation produces more accurate predictions for a large number of the functional classes considered, and also that the classes that benefit most from this approach are those containing the fewest members. In addition, we show how our proposed framework can be used for integrating information from the entire GO hierarchy for improving the accuracy of predictions made over a set of base classes. Finally, we provide qualitative and quantitative evidence that this incorporation of functional inter-relationships enables the discovery of interesting biology in the form of novel functional annotations for several yeast proteins, such as Sna4, Rtn1 and Lin1. Conclusion: We implemented and evaluated a methodology for incorporating inter-relationships between functional classes into a standard classification-based protein function prediction algorithm. Our results show that this incorporation can help improve the accuracy of such algorithms, and help uncover novel biology in the form of previously unknown functional annotations. The complete source code, a sample data set and the additional files for this paper are available free of charge for non-commercial use at http://www.cs.umn.edu/vk/gaurav/functionalsimilarity/.

    The Complete Spectrum of Yeast Chromosome Instability Genes Identifies Candidate CIN Cancer Genes and Functional Roles for ASTRA Complex Components

    Chromosome instability (CIN) is observed in most solid tumors and is linked to somatic mutations in genome integrity maintenance genes. The spectrum of mutations that cause CIN is only partly known and it is not possible to predict a priori all pathways whose disruption might lead to CIN. To address this issue, we generated a catalogue of CIN genes and pathways by screening ∼2,000 reduction-of-function alleles for 90% of essential genes in Saccharomyces cerevisiae. Integrating this with published CIN phenotypes for other yeast genes generated a systematic CIN gene dataset comprised of 692 genes. Enriched gene ontology terms defined cellular CIN pathways that, together with sequence orthologs, created a list of human CIN candidate genes, which we cross-referenced to published somatic mutation databases revealing hundreds of mutated CIN candidate genes. Characterization of some poorly characterized CIN genes revealed short telomeres in mutants of the ASTRA/TTT components TTI1 and ASA1. High-throughput phenotypic profiling links ASA1 to TTT (Tel2-Tti1-Tti2) complex function and to TORC1 signaling via Tor1p stability, consistent with the role of TTT in PI3-kinase related kinase biogenesis. The comprehensive CIN gene list presented here in principle comprises all conserved eukaryotic genome integrity pathways. Deriving human CIN candidate genes from the list allows direct cross-referencing with tumor mutational data and thus candidate mutations potentially driving CIN in tumors. Overall, the CIN gene spectrum reveals new chromosome biology and will help us to understand CIN phenotypes in human disease.

    Inferring the role of transcription factors in regulatory networks

    Background: Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well studied, simple organisms up to higher eukaryotes. Admittedly, a key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles, in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays. Results: We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges), and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network in order to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to the S. cerevisiae transcriptional network (2419 nodes and 4344 interactions), by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions). In addition, we report predictions for 14.5% of all interactions. Conclusion: Our approach requires neither accurate expression levels nor time series. Nevertheless, we show on both real and artificial data that a relatively small number of perturbation experiments is enough to determine a significant portion of regulatory effects. This is a key practical asset compared to statistical methods for network reconstruction. We demonstrate that our approach is able to provide accurate predictions, even when the network is incomplete and the data is noisy.

    An Integrative Multi-Network and Multi-Classifier Approach to Predict Genetic Interactions

    Genetic interactions occur when a combination of mutations results in a surprising phenotype. These interactions capture functional redundancy, and thus are important for predicting function, dissecting protein complexes into functional pathways, and exploring the mechanistic underpinnings of common human diseases. Synthetic sickness and lethality are the most studied types of genetic interactions in yeast. However, even in yeast, only a small proportion of gene pairs have been tested for genetic interactions due to the large number of possible combinations of gene pairs. To expand the set of known synthetic lethal (SL) interactions, we have devised an integrative, multi-network approach for predicting these interactions that significantly improves upon the existing approaches. First, we defined a large number of features for characterizing the relationships between pairs of genes from various data sources. In particular, these features are independent of the known SL interactions, in contrast to some previous approaches. Using these features, we developed a non-parametric multi-classifier system for predicting SL interactions that enabled the simultaneous use of multiple classification procedures. Several comprehensive experiments demonstrated that the SL-independent features in conjunction with the advanced classification scheme led to an improved performance when compared to the current state-of-the-art method. Using this approach, we derived the first yeast transcription factor genetic interaction network, part of which was well supported by literature. We also used this approach to predict SL interactions between all non-essential gene pairs in yeast (http://sage.fhcrc.org/downloads/downloads/predicted_yeast_genetic_interactions.zip). This integrative approach is expected to be more effective and robust in uncovering new genetic interactions from the tens of millions of unknown gene pairs in yeast and from the hundreds of millions of gene pairs in higher organisms like mouse and human, in which very few genetic interactions have been identified to date.

    A Role for Phosphatidic Acid in the Formation of “Supersized” Lipid Droplets

    Lipid droplets (LDs) are important cellular organelles that govern the storage and turnover of lipids. Little is known about how the size of LDs is controlled, although LDs of diverse sizes have been observed in different tissues and under different (patho)physiological conditions. Recent studies have indicated that the size of LDs may influence adipogenesis, the rate of lipolysis and the oxidation of fatty acids. Here, a genome-wide screen identifies ten yeast mutants producing “supersized” LDs that are up to 50 times the volume of those in wild-type cells. The mutated genes include: FLD1, which encodes a homologue of mammalian seipin; five genes (CDS1, INO2, INO4, CHO2, and OPI3) that are known to regulate phospholipid metabolism; two genes (CKB1 and CKB2) encoding subunits of casein kinase 2; and two genes (MRPS35 and RTC2) of unknown function. Biochemical and genetic analyses reveal that a common feature of these mutants is an increase in the level of cellular phosphatidic acid (PA). Results from in vivo and in vitro analyses indicate that PA may facilitate the coalescence of contacting LDs, resulting in the formation of “supersized” LDs. In summary, our results provide important insights into how the size of LDs is determined and identify novel gene products that regulate phospholipid metabolism.

    Linking Yeast Gcn5p Catalytic Function and Gene Regulation Using a Quantitative, Graded Dominant Mutant Approach

    Establishing causative links between protein functional domains and global gene regulation is critical for advancements in genetics, biotechnology, disease treatment, and systems biology. This task is challenging for multifunctional proteins when relying on traditional approaches such as gene deletions, since they remove all domains simultaneously. Here, we describe a novel approach to extract quantitative, causative links by modulating the expression of a dominant mutant allele to create a function-specific competitive inhibition. Using the yeast histone acetyltransferase Gcn5p as a case study, we demonstrate the utility of this approach and (1) find evidence that Gcn5p is more involved in cell-wide gene repression than in the accepted gene activation associated with HATs, (2) identify previously unknown gene targets and interactions for Gcn5p-based acetylation, (3) quantify the strength of some Gcn5p-DNA associations, (4) demonstrate that this approach can be used to correctly identify canonical chromatin modifications, (5) establish the role of acetyltransferase activity in synthetic lethal interactions, and (6) identify new functional classes of genes regulated by Gcn5p acetyltransferase activity. All six of these major conclusions were unattainable using standard gene knockout studies alone. We recommend that a graded dominant mutant approach be used in conjunction with a traditional knockout to study multifunctional proteins and generate higher-resolution data that more accurately probe protein domain function and influence.